14 research outputs found

    Numeracy of Language Models: Joint Modelling of Words and Numbers

    Numeracy and literacy are the abilities to understand and work with numbers and words, respectively. While both skills are necessary for reading and writing documents in clinical, scientific, and other technical domains, existing statistical language models focus on words at the expense of numbers: numbers are ignored, masked, or treated like words, which can obscure numerical content and cause sparsity issues, e.g. high out-of-vocabulary rates. In this thesis, we investigate whether the performance of neural language models can be improved by i) considering numerical information as additional input and ii) explicitly modelling the output of numerical tokens. In experiments with numbers as input, we find that numerical input features improve perplexity by 33% on a clinical dataset. In assisted text entry and verification tasks, numerical input features improve recall from 25.03% to 71.28% for word prediction with a list of 5 suggestions, keystroke savings from 34.35% to 44.81% for word completion, and the F1 metric by 5 points for semantic error correction. Numerical information from an accompanying knowledge base improves performance further. In experiments with numerical tokens as output, we consider different strategies, e.g. memorisation and digit-by-digit composition, and propose a novel neural component based on Gaussian mixture density estimation. We propose the use of regression metrics to evaluate numerical accuracy and an adjusted perplexity metric that accounts for the high out-of-vocabulary rate of numerals. Our evaluation on clinical and scientific datasets shows that perplexity can be improved by more than 2 and 4 orders of magnitude, respectively, by modelling words and numerals with separate sub-models through a hierarchical softmax. For the same datasets, our proposed mixture of Gaussians model achieved a 32% and 54% reduction in mean absolute percentage error over the contender strategy, digit-by-digit composition.
    We conclude with a critical reflection on this thesis and suggestions for future work.
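The core idea of the Gaussian mixture component described above — scoring a numeral with a continuous density emitted by the model, rather than a softmax over a closed numeral vocabulary — can be sketched as follows. This is an illustrative numpy-only sketch, not the thesis's implementation; the function name and the toy mixture parameters are invented for the example.

```python
import numpy as np

def gmm_log_density(x, weights, means, stds):
    """Log-density of value x under a 1-D Gaussian mixture.

    A language model can emit (weights, means, stds) from its hidden
    state and score a numeral by this continuous density, so numerals
    outside any fixed vocabulary still receive a probability.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # log of each component's Gaussian pdf at x, plus its mixture weight
    comp = (
        np.log(weights)
        - 0.5 * np.log(2 * np.pi)
        - np.log(stds)
        - 0.5 * ((x - means) / stds) ** 2
    )
    # numerically stable log-sum-exp over the mixture components
    m = np.max(comp)
    return m + np.log(np.sum(np.exp(comp - m)))

# A numeral near a component mean scores higher than a far-away one,
# e.g. a plausible body temperature vs. an implausible one.
close = gmm_log_density(37.0, [0.7, 0.3], [36.8, 120.0], [0.5, 10.0])
far = gmm_log_density(500.0, [0.7, 0.3], [36.8, 120.0], [0.5, 10.0])
assert close > far
```

Because the density is continuous, nearby values share probability mass, which is what makes regression-style metrics (rather than exact-match perplexity alone) a natural fit for evaluating it.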

    Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers

    Numeracy is the ability to understand and work with numbers. It is a necessary skill for composing and understanding documents in clinical, scientific, and other technical domains. In this paper, we explore different strategies for modelling numerals with language models, such as memorisation and digit-by-digit composition, and propose a novel neural architecture that uses a continuous probability density function to model numerals from an open vocabulary. Our evaluation on clinical and scientific datasets shows that using hierarchical models to distinguish numerals from words improves a perplexity metric on the subset of numerals by 2 and 4 orders of magnitude, respectively, over non-hierarchical models. A combination of strategies can further improve perplexity. Our continuous probability density function model reduces mean absolute percentage errors by 18% and 54% in comparison to the second best strategy for each dataset, respectively.
    Comment: accepted at ACL 201
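The hierarchical separation of numerals from words amounts to factorising the next-token probability as p(token) = p(class | context) · p(token | class, context), so each branch can use its own sub-model. A minimal sketch of that factorisation, with made-up toy probabilities standing in for model outputs:

```python
# Two-level (hierarchical) factorisation of next-token probabilities:
# first choose the token class (WORD vs NUMERAL), then a token within
# that class. All numbers below are illustrative, not model outputs.

p_class = {"WORD": 0.9, "NUMERAL": 0.1}

p_within = {
    "WORD": {"the": 0.25, "temperature": 0.02},
    "NUMERAL": {"37.2": 0.4, "100": 0.1},
}

def token_prob(token, cls):
    """p(token) = p(class) * p(token | class)."""
    return p_class[cls] * p_within[cls][token]

# Perplexity on the numeral subset depends only on the NUMERAL branch,
# which is why a dedicated numeral sub-model can improve it so much.
assert abs(token_prob("37.2", "NUMERAL") - 0.04) < 1e-9
```

In the paper's setting the numeral branch can itself be any of the compared strategies (memorisation, digit-by-digit composition, or the continuous density model).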

    Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation

    The popularity of image sharing on social media reflects the important role visual context plays in everyday conversation. In this paper, we present a novel task, Image-Grounded Conversations (IGC), in which natural-sounding conversations are generated about shared photographic images. We investigate this task using training data derived from image-grounded conversations on social media and introduce a new dataset of crowd-sourced conversations for benchmarking progress. Experiments using deep neural network models trained on social media data show that the combination of visual and textual context can enhance the quality of generated conversational turns. In human evaluation, a gap between human performance and that of both neural and retrieval architectures suggests that IGC presents an interesting challenge for vision and language research.

    Theoretical analysis and empirical study of the application of ADIDA forecasting methodology to non-intermittent demand data

    134 pp. Includes a CD with Matlab code.
    This study considers theoretical and empirical aspects of the application of the ADIDA methodology, an aggregation-disaggregation approach to forecasting intermittent demand data. Its performance is assessed with emphasis on its use for forecasting non-intermittent demand data. Initially, concepts and applications of forecasting are presented. The characteristics of time series and the process of decomposition into individual components are analyzed. After a description of forecasting methods, with emphasis on exponential smoothing, error measures for accuracy are presented, and the purpose of evaluating forecasting methods is outlined through a retrospective of forecasting competitions. Through theoretical analysis, the methodology is broken down into simpler sub-stages, which are studied separately. With respect to practical application, positive results for predicting intermittent demand time series are shown. Gaps in the experimental use on non-intermittent demand data are identified, and the dependence of the methodology's performance on mechanisms for handling seasonality is examined. Subsequently, the effect of the aggregation level is theoretically examined, and the improving effect of the methodology is experimentally tested under two approaches: using a fixed aggregation level across all series, or a separate aggregation level per series. For the second approach, techniques to estimate the optimal level are studied and lower error bounds are calculated. Furthermore, after a thorough literature search, various disaggregation algorithms are described and empirically studied.
    In the end, the conclusions are summarized, revealing the potential of the methodology for forecast error reduction, and directions for further research on the questions that arise are proposed.
    Γεώργιος Π. Σπιθουράκη
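The aggregate-forecast-disaggregate cycle at the heart of ADIDA can be sketched in a few lines. This is a deliberately simplified illustration: it uses a naive bucket mean where the thesis studies exponential smoothing, and an equal-weight split where the thesis compares several disaggregation algorithms; the function name is invented for the example.

```python
import numpy as np

def adida_forecast(series, bucket=3):
    """Aggregation-disaggregation (ADIDA-style) per-period forecast.

    1) Aggregate the series into non-overlapping buckets of `bucket`
       periods, which smooths out the zeros of intermittent demand,
    2) forecast the next bucket total (here a naive mean, standing in
       for exponential smoothing),
    3) disaggregate the bucket forecast equally across its periods.
    """
    series = np.asarray(series, dtype=float)
    n = (len(series) // bucket) * bucket           # drop the remainder
    buckets = series[:n].reshape(-1, bucket).sum(axis=1)
    bucket_forecast = buckets.mean()               # next-bucket total
    return bucket_forecast / bucket                # equal-weight split

# Intermittent demand: many zero periods, occasional positive demand.
demand = [0, 0, 3, 0, 2, 1, 0, 0, 4, 1, 0, 2]
per_period = adida_forecast(demand, bucket=3)      # 13/12 ≈ 1.083
```

The choice of `bucket` is exactly the aggregation-level question the study examines: a level fixed for all series versus one tuned per series.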

    Numerically Grounded Language Models for Semantic Error Correction

    Semantic error detection and correction is an important task for applications such as fact checking, speech-to-text, or grammatical error correction. Current approaches generally focus on relatively shallow semantics and do not account for numeric quantities. Our approach uses language models grounded in numbers within the text. Such groundings are easily achieved for recurrent neural language model architectures, which can be further conditioned on incomplete background knowledge bases. Our evaluation on clinical reports shows that numerical grounding improves perplexity by 33% and F1 for semantic error correction by 5 points when compared to ungrounded approaches. Conditioning on a knowledge base yields further improvements.
    Comment: accepted to EMNLP 201
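One simple way to ground a recurrent language model in the numbers it reads is to append numeric features of the token's value to its embedding at the input layer. The sketch below is illustrative only: the sign/log-magnitude feature choice and the function name are assumptions for the example, not the paper's exact recipe.

```python
import numpy as np

def grounded_input(token_embedding, numeric_value=None):
    """Concatenate a token embedding with simple numeric features.

    For numeral tokens the model also sees the value itself (here a
    sign and a log magnitude); for plain words the numeric slot is
    zeroed, so all inputs share one fixed dimensionality.
    """
    if numeric_value is None:
        feats = np.zeros(2)                      # word token: no value
    else:
        v = float(numeric_value)
        feats = np.array([np.sign(v), np.log1p(abs(v))])
    return np.concatenate([token_embedding, feats])

emb = np.ones(4)                       # toy 4-d token embedding
x_word = grounded_input(emb)           # e.g. the token "pressure"
x_num = grounded_input(emb, 120.0)     # e.g. the numeral "120"
assert x_word.shape == x_num.shape == (6,)
```

The same grounded inputs can feed a classifier over candidate corrections, which is where the reported F1 gains for semantic error correction come from.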

    NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue

    We present NLU++, a novel dataset for natural language understanding (NLU) in task-oriented dialogue (ToD) systems, with the aim of providing a much more challenging evaluation environment for dialogue NLU models, up to date with current application and industry requirements. NLU++ is divided into two domains (BANKING and HOTELS) and brings several crucial improvements over commonly used NLU datasets. 1) NLU++ provides fine-grained domain ontologies with a large set of challenging multi-intent sentences, introducing and validating the idea of intent modules that can be combined into complex intents conveying complex user goals, together with finer-grained and thus more challenging slot sets. 2) The ontology is divided into domain-specific and generic (i.e., domain-universal) intent modules that overlap across domains, promoting cross-domain reusability of annotated examples. 3) The dataset design has been inspired by the problems observed in industrial ToD systems, and 4) it has been collected, filtered and carefully annotated by dialogue NLU experts, yielding high-quality annotated data. Finally, we benchmark a series of current state-of-the-art NLU models on NLU++; the results demonstrate the challenging nature of the dataset, especially in low-data regimes, confirm the validity of `intent modularisation', and call for further research on ToD NLU.
    Comment: 16 pages, 1 figure, 10 tables. Accepted in NAACL 2022 (Findings)
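The intent-module idea described above treats a complex intent as a combination of reusable building blocks, some generic and some domain-specific. A toy sketch of that multi-label representation; the module names and the example sentence are invented for illustration and are not taken from the NLU++ ontology:

```python
# Complex intents as combinations of intent modules (toy sketch).
# Generic modules are reusable across domains; domain modules are not.
GENERIC_MODULES = {"request", "cancel", "affirm"}
BANKING_MODULES = {"card", "transfer", "balance"}

def complex_intent(modules):
    """A complex intent label is the set of its constituent modules."""
    allowed = GENERIC_MODULES | BANKING_MODULES
    unknown = modules - allowed
    assert not unknown, f"unknown intent modules: {unknown}"
    return frozenset(modules)

# "Can you cancel the transfer from my card?" (invented example)
label = complex_intent({"cancel", "transfer", "card"})
assert "cancel" in label and len(label) == 3
```

Because generic modules like `cancel` recur in every domain, examples annotated with them can be reused when porting a model from BANKING to HOTELS.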